Getting Started with Text Analysis Project


Project Deliverables 


You will be required to provide the following deliverables. 


• A Python notebook with your solution.


Instructions 


Background Information 


The management of a marketing firm would like to track the sentiment of its
customers. Doing so would shorten the time it takes to act on customer
feedback.


Problem Statement 


Your task for this project is to build a model that predicts whether the sentiment
of a tweet is positive or negative. The target accuracy for your model is 70%.
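
To make the accuracy target concrete, here is a minimal sketch of how a candidate model might be checked against it. It assumes scikit-learn is available; the toy tweets, the labels (1 = positive, 0 = negative), the `evaluate` helper, and the choice of TF-IDF with logistic regression are all illustrative assumptions, not requirements of this brief.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

TARGET_ACCURACY = 0.70  # target accuracy from the problem statement

# Hypothetical toy data; the real project uses the tweets from the dataset.
tweets = [
    "love this phone so much",
    "great service and friendly staff",
    "best purchase this year",
    "really happy with the delivery",
    "worst experience ever",
    "terrible support never again",
    "the app keeps crashing",
    "very disappointed with the quality",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = positive, 0 = negative (assumed encoding)


def evaluate(texts, targets, seed=42):
    """Train on a stratified split and return accuracy on the held-out part."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, targets, test_size=0.25, random_state=seed, stratify=targets
    )
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(x_train, y_train)
    return accuracy_score(y_test, model.predict(x_test))


accuracy = evaluate(tweets, labels)
print(f"accuracy={accuracy:.2f}, meets target: {accuracy >= TARGET_ACCURACY}")
```

With only a handful of toy tweets the printed score is not meaningful; the point is the shape of the check, which stays the same once the real dataset is plugged in.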


Below are the text processing steps that you will be required to perform in this project: 


• Text Cleaning / Text Processing
    o Removing all URLs/links
    o Replacing @ and # Characters
    o Feature Construction (No. of Punctuation Characters)
    o Removing Punctuation Characters
    o Feature Construction (Lowercase, Uppercase and Proper case words)
    o Conversion to Lowercase
    o Splitting Concatenated words
    o Spelling Correction
    o Feature Construction (Counting the no. of stop words/tweet)
    o Removing Stop words
    o Lemmatization
• Text Feature Engineering Techniques
    o Length of text
    o Word Count
    o Word density (Average no. of Words / Tweet)
    o Noun Count
    o Verb Count
    o Adjective Count
    o Adverb Count
    o Pronoun Count
    o Polarity
    o Subjectivity
    o Word Level N-Gram TF-IDF (tweet_word_tfidf)
    o Character Level N-Gram TF-IDF (tweet_character_tfidf)
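
Several of the cleaning and feature construction steps listed above can be sketched with the Python standard library alone. The function name `clean_tweet`, the feature names, the tiny stop word list, and the decision to replace @ and # with nothing are assumptions made for illustration. Splitting concatenated words, spelling correction, lemmatization, part-of-speech counts, polarity, and subjectivity are left out because they normally rely on an NLP library.

```python
import re
import string

# Tiny illustrative stop word list (an assumption); a real run would use a
# fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Return (cleaned_text, features) for a single tweet."""
    features = {}
    text = URL_RE.sub("", text)                    # remove URLs/links
    text = text.replace("@", "").replace("#", "")  # replace @ and # (assumed: drop them)
    # Count punctuation before stripping it.
    features["punct_count"] = sum(ch in string.punctuation for ch in text)
    text = text.translate(PUNCT_TABLE)             # remove punctuation characters
    words = text.split()
    # Case features must be taken before lowercasing.
    features["upper_words"] = sum(w.isupper() for w in words)
    features["lower_words"] = sum(w.islower() for w in words)
    features["proper_words"] = sum(w.istitle() for w in words)
    words = [w.lower() for w in words]             # conversion to lowercase
    features["stop_word_count"] = sum(w in STOP_WORDS for w in words)
    words = [w for w in words if w not in STOP_WORDS]  # remove stop words
    cleaned = " ".join(words)
    features["char_length"] = len(cleaned)         # length of text
    features["word_count"] = len(words)            # word count
    return cleaned, features


cleaned, features = clean_tweet(
    "Loving the NEW phone from @Acme!!! #happy https://t.co/abc123"
)
print(cleaned)   # loving new phone from acme happy
print(features)
```

The order matters here: punctuation and case features are computed before the text is stripped and lowercased, which mirrors the order of the steps in the list.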

You can use the following Guiding Template [Link].


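The two TF-IDF features named in the feature engineering list, tweet_word_tfidf and tweet_character_tfidf, could be built roughly as follows. This is a sketch that assumes scikit-learn's `TfidfVectorizer`; the n-gram ranges and the sample tweets are illustrative assumptions, since the brief does not specify them.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical, already-cleaned tweets standing in for the real data.
cleaned_tweets = [
    "loving new phone",
    "battery died again",
    "great service great price",
]

# Word-level n-grams (unigrams and bigrams; the range is an assumption).
word_vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 2))
tweet_word_tfidf = word_vectorizer.fit_transform(cleaned_tweets)

# Character-level n-grams (2 to 3 characters; the range is an assumption).
char_vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))
tweet_character_tfidf = char_vectorizer.fit_transform(cleaned_tweets)

# Each result is a sparse matrix with one row per tweet.
print(tweet_word_tfidf.shape, tweet_character_tfidf.shape)
```

Both matrices can be fed to a classifier directly, or stacked next to the count-based features from the rest of the list.
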
Dataset 


• Datasets for this project can be found here: [https://bit.ly/31kqByD].
• You can load the dataset from the URL.
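
Loading from a URL might look like the sketch below. It assumes pandas is installed and that the file is a CSV, which this brief does not confirm. The helper name `load_tweets` and the column names in the in-memory sample are hypothetical; the sample stands in for the real file so the snippet runs without network access.

```python
import io

import pandas as pd


def load_tweets(source):
    """Load the dataset from a URL, file path, or file-like object (CSV assumed)."""
    return pd.read_csv(source)


# Hypothetical stand-in for the real file; column names are assumptions.
sample_csv = io.StringIO(
    "tweet,sentiment\n"
    "Great service!,1\n"
    "Never again.,0\n"
)
df = load_tweets(sample_csv)
print(df.shape)
print(df.head())
```

In the notebook, the dataset URL given above would be passed to `load_tweets` in place of the sample, followed by a quick look at `df.columns` to see which fields hold the tweet text and the sentiment label.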


Dataset Source: Twitter 


